Abstract

This paper deals with the design of low-rate sparse-graph codes with linear minimum distance ($d_{\min}$) in the
blocklength. First, we define a necessary condition that a quite general family of graphical codes has to satisfy in
order to have linear $d_{\min}$. The condition generalizes results known for turbo codes [9] and LDPC codes. Secondly,
[...] structure, [...] designing an efficient low-rate code. As a final result of our investigation, we present a new
ensemble of low-rate codes, designed under the necessary condition and having bits of degree 1. The asymptotic analysis
of the ensemble shows that its iterative threshold is close to the Shannon limit. It also has linear $d_{\min}$, a
simple structure, and enjoys a low decoding complexity and a fast convergence.



I. Introduction 

Low-rate codes play a crucial role in communication systems operating in the low signal-to-noise
ratio regime, such as power-limited sensor networks, ultra-wideband communication schemes and
code-spread CDMA systems. More recently, it has also been found that powerful low-rate codes with
a fast decoding algorithm can be used in the reconciliation phase of continuous-variable quantum key
distribution protocols, where they significantly increase the range of the protocol [19].

Since the invention of turbo codes [8], a lot of effort has been put into designing sparse-graph codes for
various applications. This is due to the attractive features of the iterative decoding algorithm used for such
codes, namely its low decoding complexity and good performance. Although the design of good low-rate
sparse-graph codes is of great interest, it is not straightforward. By a good low-rate code ensemble we
mean an ensemble with an iterative threshold close to the channel capacity and a good minimum distance,
which is necessary to obtain a low error floor. The problem lies in the fact that, in order to design a
low-rate code with performance close to the channel capacity, it seems crucial to have a large fraction of
variable nodes of degrees 1 and 2 in the code structure. But the presence of a large number of variable
nodes of low degree is not favorable for the growth of the minimum distance, which may become logarithmic or,
even worse, constant. This phenomenon has been quantified in several papers, for instance in
[21], [27], [20]. A way to circumvent the problem is to introduce some structure in the bipartite graph of
a low-rate ensemble, preventing the formation of low-weight codewords.

Recently, some high-performance low-rate schemes have been proposed. A rate-1/10 multi-edge LDPC
ensemble with a threshold of -1.09 dB on the AWGN channel was presented in [23]. This construction can
be viewed as a serial concatenation of a (3,15) LDPC code and of an LT code, and it possesses a complex
structure. Its minimum distance growth inherits the minimum distance property of the underlying (3,15)
LDPC inner code, i.e. it is linear in the blocklength. In [12], the authors introduced low-rate ARA-type LDPC
codes of different rates in the range from 1/3 to 1/10. The proposed ensembles have iterative thresholds
close to the channel capacity and a simpler structure compared to the previous multi-edge ensemble, but
their minimum distance grows only polynomially in the blocklength.$^1$ Also, in [18], the authors presented a
parallel concatenation of Zigzag-Hadamard (ZH) codes. These codes are decoded in a turbo-like fashion,
by using the fast Hadamard transform for small Hadamard component codes. This yields a rather
low-complexity decoding algorithm. The concatenated ZH ensembles have rates down to 0.00105. As for their
minimum distance, the reasoning from [28] can be adapted to show that the minimum distance of such
a construction is of order rv- M ~ x '' M , where n is the blocklength and M is the number of component ZH 
codes. This case is treated in 

In this work, we propose an alternative low-rate code structure, which enjoys a good minimum distance,
a good iterative threshold and a low decoding complexity. Our approach avoids fixing a complex bipartite
graph structure and yields a flexible irregular construction. Hence, the degree distribution of this
construction can be optimized by a simple one-dimensional optimization. The procedure that we adopt is
the following: a) we first provide a necessary condition to ensure linear minimum distance, b) then we
design a low-rate code ensemble which satisfies this condition, based on a component code that enjoys a
low-complexity decoding algorithm. To fulfill the first point, we define a special graph, called the graph
of codewords of partial weight 2. This graph is derived from connections of variable nodes of degrees
1 and 2 and of low-weight codewords of component codes. In some sense, it is a generalization of the
subgraph induced by degree-2 variable nodes for LDPC codes [11] to any sparse-graph code ensemble.

1 More precisely, it is of order $O(n^{3/4})$, see [6].

Tail-biting Trellis LDPC (or TLDPC) codes have been introduced in [4], [?]. This family enjoys an
iterative threshold close to the channel capacity, a linear minimum distance and a very low decoding
complexity. Examples of TLDPC codes of rates 1/3 and 1/2 were presented in [?], [?]. In this paper,
we use the framework of TLDPC codes to design a code ensemble of lower rate. We propose a new
TLDPC component code with a very simple structure. The proposed component code has an interesting
feature, which makes it possible to obtain linear minimum distance: the supports of its low-weight
codewords are distributed among the code positions in such a way that the unions of intersecting supports
form disjoint clusters. We will discuss this property in detail later in the paper. We also emphasize
that our choice of the component code allows a large non-zero fraction of degree-1 bits in the
code structure, while keeping the minimum distance growing linearly in the blocklength. The presence of
degree-1 bits improves the performance of iterative decoding, as will be explained later in the paper.

To design a low-rate TLDPC ensemble both with linear minimum distance and an iterative threshold 
close to the channel capacity, we put a constraint on the maximum allowed fraction of degree-2 variable 
nodes and optimize over the degree distribution of variable nodes by using EXIT charts. Moreover, in order 
to satisfy the necessary condition for linear minimum distance we have found, we propose a structured way 
to generate the permutation for edges connected to degree-2 variable nodes. There is no other constraint
on the generation of the permutation for the other edges in the bipartite graph; it is assumed to be drawn
uniformly at random.

The paper is organized as follows. In Section II, the graph of codewords of partial weight 2 and
a necessary condition for linear minimum distance are provided. Section III gives an insight into why it is
important to put degree-1 variable nodes in the graphical structure. A general introduction to TLDPC
codes and a presentation of the new low-rate ensemble are given in Section IV. Numerical results are
shown in Section V. Section VI contains some discussion on the topic.



II. Necessary Condition for Linear Minimum Distance 

A. Common Representation for Sparse-Graph Codes 

For the sake of generality, we use the following general representation for all sparse-graph codes [26]:

Definition 1 (Common construction and base code): The construction produces a binary code of length
$n$ with the help of two ingredients:

(i) a binary code $B$ of rate $R_b$ and of length $m$, with $m \ge n$. This code is called the base code;

(ii) a bipartite graph between two sets $V$ and $W$ of vertices of size $n$ and $m$ respectively, where the
degree of any vertex in $W$ is 1 and the degree of the vertices in $V$ is specified by a degree distribution
$\Lambda = (\lambda_1, \lambda_2, \dots, \lambda_s)$, where $\lambda_i$ denotes the fraction of edges incident to
vertices of $V$ of degree $i$.

The bipartite graph together with the base code specifies a code of length $n$ as the set of binary
assignments of $V$ such that the induced assignments$^2$ of vertices of $W$ belong to $B$. It is straightforward
to check that the rate of the code obtained by this construction is at least equal to the designed rate $R$,

$$R \stackrel{\text{def}}{=} 1 - (1 - R_b)\,\bar\lambda,$$

where $\bar\lambda$ is the average left degree, which is given by

$$\bar\lambda \stackrel{\text{def}}{=} \frac{m}{n} = \frac{1}{\sum_i \lambda_i / i}.$$

It is common to present the degree distribution $\Lambda$ in its polynomial form $\lambda(x) = \sum_{i=1}^{s} \lambda_i x^{i-1}$.
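As a quick numerical illustration of Definition 1, the following minimal sketch computes the average left degree and the designed rate from an edge-perspective degree distribution. The function names and the dictionary encoding of the degree distribution are assumptions made for this example only; the (3,6)-regular LDPC code used as input is a standard textbook case, not an ensemble from this paper.

```python
from fractions import Fraction


def average_left_degree(lam):
    """lam maps a degree i to lambda_i, the fraction of edges incident
    to variable nodes of degree i (edge perspective)."""
    assert sum(lam.values()) == 1, "edge fractions must sum to 1"
    # Average left degree m/n = 1 / sum_i(lambda_i / i).
    return 1 / sum(Fraction(l) / i for i, l in lam.items())


def designed_rate(lam, base_rate):
    """Designed rate R = 1 - (1 - R_b) * (average left degree)."""
    return 1 - (1 - Fraction(base_rate)) * average_left_degree(lam)


# Hypothetical input: a (3,6)-regular LDPC code. All edges touch degree-3
# variable nodes, and the base code is a juxtaposition of length-6 parity
# codes, each of rate 5/6.
lam = {3: Fraction(1)}
avg = average_left_degree(lam)
rate = designed_rate(lam, Fraction(5, 6))
print(avg, rate)
```

Exact rational arithmetic is used so that the familiar values (average degree 3, rate 1/2) come out without rounding.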
Most sparse-graph code constructions can be viewed as a particular instance of this construction: 
Example [LDPC codes] The LDPC base code is the juxtaposition of parity codes; $\lambda(x)$ is the left
degree distribution. $\diamond$

When the bipartite graph has some special structure, we say that it is a structured code ensemble.

Example [Parallel turbo codes] The base code of a parallel turbo ensemble is the juxtaposition of several
convolutional codes, the positions of which are divided into two subsets: the first one is formed by the
information bits and the second one by the redundancy bits. The sets $V$ and $W$ in the bipartite graph are
also divided into two subsets (information and redundancy). A node in $V$ and a node in $W$
can be connected only if they belong to the same subset type, and all redundancy nodes have degree 1. $\diamond$

2 A vertex in $W$ receives the same assignment as the vertex in $V$ it is connected to.

The standard decoding procedure [24] for sparse-graph codes is the following. At each iteration, base
code decoding is performed in order to get extrinsic messages for bits of $B$, then intrinsic messages at
the variable node side are calculated. After some number of iterations, a posteriori messages of code bits
are computed. The decoding complexity therefore depends on the complexity of the base code decoding,
on the degree distribution of variable nodes (the higher the node degree, the more complex decoding gets)
and on the number of iterations that need to be performed (i.e. the decoding convergence speed).

B. Graph of Codewords of Partial Weight 2 

A position in the base code $B$ is said to have degree $i$ if it is connected to a node of degree $i$ in $V$.
Notice that here we allow variable nodes to be of degree $\ge 1$ and, therefore, we allow positions of degree
1 in $B$. The location of these positions has a crucial impact on the minimum distance of the overall code,
which may become constant in the worst case. In what follows, this case is supposed to be avoided. To
study the minimum distance behavior, we make the following two definitions.

Definition 2 (Codewords of $B$ of partial weight 2): Codewords of $B$ of partial weight 2 are the
codewords that involve exactly two non-zero positions of degree $> 1$.

Definition 3 (Clusters): A cluster is a set of positions of degree $> 1$ in $B$ such that for any two
positions $i$ and $j$ from this set there exists a codeword of partial weight 2 in $B$ containing $i$ and $j$.

The simplest example of clusters can be given in the case of LDPC codes.

Example [Clusters for LDPC codes, $\lambda_1 = 0$] Any two positions of the same parity code form the support
of one codeword of partial weight 2. Thus, clusters correspond to sets of positions belonging to
the same parity code. $\diamond$

With this notion of cluster, we can now define the graph of codewords of partial weight 2:

Definition 4 (Graph of codewords of partial weight 2): The graph of codewords of partial weight 2 is
a graph $G = (V, E)$ with vertex set $V$ and edge set $E$. $V$ is equal to the set of clusters, and there is an
edge $e_{ij}$ between two clusters $v_i$ and $v_j$ iff there exist two positions $x_k$ and $x_l$ of the base code, belonging
to the clusters $v_i$ and $v_j$ respectively, which are connected to the same degree-2 variable node.

Example [Graph $G$ for LDPC codes] The graph $G$ for an LDPC code contains clusters that correspond
to parity checks in the code structure. Two clusters are connected if their corresponding parity checks are
connected through a degree-2 variable node in the bipartite graph of the code. $\diamond$
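For the LDPC case with no degree-1 bits, the graph $G$ can be read directly off a parity-check matrix: clusters are the rows, and each column of weight 2 contributes one edge. The sketch below illustrates this; the function name and the toy matrix are hypothetical and serve only as an example.

```python
from collections import defaultdict


def graph_of_partial_weight_2(H):
    """H is a parity-check matrix given as a list of rows of 0/1 entries.

    Returns the edges of G as (check_a, check_b, variable) triples: one
    edge per degree-2 variable node, joining the two parity checks
    (clusters) that this variable node is connected to.
    """
    checks_of = defaultdict(list)
    for c, row in enumerate(H):
        for v, bit in enumerate(row):
            if bit:
                checks_of[v].append(c)
    return [(cs[0], cs[1], v)
            for v, cs in sorted(checks_of.items())
            if len(cs) == 2]


# Toy parity-check matrix (hypothetical): 3 checks, 5 variable nodes.
# Variables 0, 1, 2 have degree 2; variables 3 and 4 have degree 3.
H = [
    [1, 0, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 1],
]
edges = graph_of_partial_weight_2(H)
print(edges)
```

In this toy case $G$ is a triangle on the three checks, so its average degree is exactly 2.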

C. Cycles in the Graph of Codewords of Weight 2 and Its Average Degree 

It is well known [10] that the first source of low-weight codewords when an LDPC code is chosen at
random are cycles in the Tanner graph containing only variable nodes of degree 2. Let us show that they
are in one-to-one correspondence with cycles in $G$, which will allow us to state the necessary condition
on linear minimum distance ($d_{\min}$). To do so, we need the following two definitions.

Definition 5 (Node weight): For a node $v$ in $V$ and two edges $i$ and $j$ connected to it, we define a node
weight $w_v^{ij}$ as follows. By the very definition of a cluster and of the graph $G$, these two edges correspond
to two positions of degree 2, and they form, together with a certain number $a$ of positions of degree 1, the
support of a codeword of partial weight 2. We let $w_v^{ij}$ be equal to this number $a$.

Definition 6 (Cycle weight): The weight $l$ of a cycle $C = (V_c, E_c)$ in $G$ is equal to $l = |E_c| + \sum_{v \in V_c} w_v$,
where $w_v$ is the node weight associated with vertex $v$ in $V_c$ and the two edges in $E_c$ connected to $v$.

Here is a fundamental relation between cycles in $G$ and low-weight codewords of $B$:

Proposition 1: A cycle of weight $l$ in $G$ induces a codeword of weight $l$ in the sparse-graph code.

Proof: If $C = (V_c, E_c)$ is a cycle in $G$, we associate to it a configuration $x = (x_1, x_2, \dots, x_m)$ of
positions of the base code in which a) positions of the base code of degree 2 are set to 1 if in the Tanner
graph they are connected to the variable nodes of degree 2 that are associated with edges in $E_c$; b) positions
of degree 1 are set to 1 if they form a codeword of the base code of partial weight 2 with
two corresponding positions of degree 2; c) all other positions in $x$ are set to 0.

For a node $v \in V_c$, denote by $w_v$ the number of positions of degree 1 set to 1 in b). The point is that the
configuration $x$ is a codeword of the base code. It has weight $2|E_c| + \sum_{v \in V_c} w_v$: $2|E_c|$ non-zero bits of $x$
are connected to degree-2 variable nodes and the rest of them are connected to degree-1 variable nodes. Thus, there are
$|E_c| + \sum_{v \in V_c} w_v$ variable nodes participating in the configuration $x$, and they correspond to a codeword
of weight $|E_c| + \sum_{v \in V_c} w_v$. ■
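Proposition 1 can be checked by hand in the simplest setting, an LDPC code without degree-1 bits, where all node weights are zero and a cycle of $G$ with $|E_c|$ edges should give a codeword of weight $|E_c|$. The sketch below is an illustration under that assumption. The helper names, the toy parity-check matrix and the edge list (one edge per weight-2 column of the matrix) are hypothetical.

```python
def tree_path(tree, src, dst):
    """Edge labels along the unique path src -> dst in a forest, or None."""
    stack = [(src, None, [])]
    while stack:
        node, prev_label, labels = stack.pop()
        if node == dst:
            return labels
        for nxt, label in tree[node]:
            if label != prev_label:  # do not walk back along the same edge
                stack.append((nxt, label, labels + [label]))
    return None


def find_cycle(num_clusters, edges):
    """Return the edge labels (degree-2 variable nodes) of one cycle of G,
    or None if G is a forest. edges holds (cluster_a, cluster_b, label)."""
    tree = {c: [] for c in range(num_clusters)}  # forest built so far
    for a, b, label in edges:
        path = tree_path(tree, a, b)
        if path is not None:  # this edge closes a cycle
            return path + [label]
        tree[a].append((b, label))
        tree[b].append((a, label))
    return None


def is_codeword(H, x):
    """True if every parity check of H is satisfied by x."""
    return all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H)


# Toy LDPC code (hypothetical): variables 0, 1, 2 have degree 2.
H = [
    [1, 0, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 1],
]
edges = [(0, 1, 0), (1, 2, 1), (0, 2, 2)]  # (check, check, variable)

cycle = find_cycle(len(H), edges)
x = [1 if v in cycle else 0 for v in range(len(H[0]))]
print(cycle, x, is_codeword(H, x))
```

Here the cycle uses three edges and the induced word has Hamming weight 3, in line with the proposition for zero node weights.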

Notice that the weight of the smallest cycle in $G$ is an upper bound on $d_{\min}$. Therefore:

Corollary 1: If all the node weights $w_v^{ij}$ of a given graph $G$ are smaller than some constant $a > 0$,
$a \in \mathbb{N}$, then the minimum distance of its corresponding sparse-graph code is upper bounded by $(a+1)|E_c|$,
where $E_c$ is the edge set of any cycle of $G$.

Corollary 2: If $G$ contains a cycle of logarithmic weight, $d_{\min}$ of the code is at most logarithmic in $n$.

Corollary 2 can be equivalently expressed in terms of the average degree of $G$.

Theorem 1 (Upper bound on $d_{\min}$): Consider a sparse-graph code for which the corresponding graphs
$G$ have node weights upper bounded by a small positive integer $a$. If the average degrees of these
graphs are all greater than $2 + \epsilon$ for some $\epsilon > 0$, then $d_{\min}$ grows at most logarithmically in $n$.

Proof: Consider a sparse-graph code. Let $G$ be the associated graph of codewords of partial weight
2 and $d_{\min}$ be the minimum distance of the code. Let $g$ be the girth of $G$ and $\Delta$ be its average degree.
By Corollary 1 we know that $d_{\min} \le (a+1)g$. To upper bound this last quantity we use the Moore
bound for irregular graphs [?], which asserts that the number of vertices $n$ of $G$ satisfies the
inequality

$$n \ge 2\,\frac{(\Delta-1)^t - 1}{\Delta - 2}, \quad \text{where } t = \lfloor g/2 \rfloor.$$

This implies $t \le \log_{\Delta-1}\left(\frac{\Delta-2}{2}\,n + 1\right)$. We now conclude by

$$d_{\min} \le (a+1)g \le (a+1)(2t+1) \le (a+1)\left(2\log_{\Delta-1}\left(\frac{\Delta-2}{2}\,n + 1\right) + 1\right). \qquad \square$$
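To get a feeling for the size of this bound, the following sketch evaluates its right-hand side numerically. The function name and the sample parameters are assumptions chosen for illustration; here the first argument is the number of vertices (clusters) of $G$, as in the proof.

```python
import math


def dmin_upper_bound(num_clusters, avg_degree, a):
    """Evaluate (a + 1) * (2 * log_{D-1}((D - 2) / 2 * n + 1) + 1), where n
    is the number of vertices (clusters) of G, D > 2 is its average degree
    and a is the bound on the node weights."""
    if avg_degree <= 2:
        raise ValueError("the bound requires an average degree > 2")
    t_max = math.log((avg_degree - 2) / 2 * num_clusters + 1, avg_degree - 1)
    return (a + 1) * (2 * t_max + 1)


# Hypothetical example: 14 clusters, average degree 3, zero node weights.
bound = dmin_upper_bound(num_clusters=14, avg_degree=3, a=0)
print(bound)
```

For these parameters the bound evaluates to about 7, which is consistent with the Heawood graph (14 vertices, degree 3, girth 6).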

D. Necessary Condition 

The following necessary condition follows immediately:

While constructing a sparse-graph code ensemble with a linear growth of the average minimum
distance, cycles of sublinear weights in the corresponding graph $G$ of codewords of partial weight
2 must be avoided. Or, equivalently, the average degree $\Delta$ of $G$ must be smaller than or equal to 2.

Example [Case of LDPC codes] Consider an LDPC code ensemble. Let $\lambda_2$ be the fraction of edges connected
to its degree-2 variable nodes and let $\bar\rho$ be the average degree of its check nodes ($\bar\rho = m/r$, where $r$ is
the number of check nodes and $m$ is the number of edges). The number of clusters is equal to $r$. To satisfy the necessary
condition above, $G$ should not have more than $r$ edges. So, there should be at most $r$ variable nodes of
degree 2 in the bipartite graph. There are $\lambda_2 m / 2$ such nodes, so we need

$$\frac{\lambda_2 m}{2} \le r = \frac{m}{\bar\rho}, \quad \text{i.e.} \quad \lambda_2 \bar\rho \le 2.$$

Therefore, if we want linear $d_{\min}$, the necessary condition becomes $\lambda_2 \bar\rho \le 2$. $\diamond$
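The LDPC form of the necessary condition is easy to test for a candidate degree distribution. The sketch below assumes that the fraction of edges attached to degree-2 variable nodes and the average check degree are known; the function names and the numbers used are hypothetical.

```python
from fractions import Fraction


def satisfies_necessary_condition(lambda_2, avg_check_degree):
    """Necessary condition for linear d_min in the LDPC case:
    lambda_2 * (average check degree) <= 2."""
    return lambda_2 * avg_check_degree <= 2


def max_lambda_2(avg_check_degree):
    """Largest edge fraction lambda_2 compatible with the condition."""
    return Fraction(2) / avg_check_degree


# Hypothetical ensembles with average check degree 6.
ok = satisfies_necessary_condition(Fraction(3, 10), 6)     # 1.8 <= 2
too_many = satisfies_necessary_condition(Fraction(1, 2), 6)  # 3 > 2
limit = max_lambda_2(6)
print(ok, too_many, limit)
```

With average check degree 6, at most one third of the edges may be attached to degree-2 variable nodes under this condition.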

Note that $\Delta = 2$ is the critical case, when $G$ contains one or several cycles of linear length. It has
been shown in [28] that for LDPC codes and $\Delta = 2$, the minimum distance is polynomial in $n$. For more
general families of sparse-graph codes this is not true anymore, see for instance Section V of [20].

Until now we dealt with codes with bounded node weights. For some codes the node weights are
unbounded, e.g. for turbo codes. With a little work, our results can still be extended to unbounded
weights, and Corollary 2 and Theorem 1 will hold. For completeness, we derive
the bound for parallel turbo codes, which leads to a much shorter proof of the result by Breiling [9].

Theorem 2 ([9]): $d_{\min}$ of parallel turbo codes grows at most logarithmically in $n$.

Proof: For simplicity, assume that there are only two convolutional components, that both component encoders are
recursive systematic convolutional encoders of type $(n, 1)$ and that they are equal. Then, there exists $t$
such that for any information position $i$ in the convolutional code there is a codeword of partial weight
2 with information support $\{i, i+t\}$ and with redundancy weight $w$. Other codewords of partial weight
2 are deduced by addition. They have information support $\{i, i+kt\}$, their redundancy weight is at most
$kw$ and they all belong to the same cluster in $G$. Therefore, $G$ consists of (at most) $2t$ clusters,$^3$ which
are connected through $N$ edges, where $N$ is the number of information bits in the turbo code.

Note that the node weights of the clusters are unbounded. To circumvent this difficulty, we form smaller
clusters by partitioning each cluster into subclusters of size 3 of the form $\{i, i+t, i+2t\}$. We obtain a
new graph of codewords of partial weight 2, denoted by $G'$, with $2N/3 + O(1)$ clusters and of degree
3. Moreover, the node weights of $G'$ are bounded by $2w$. Therefore, $G'$ has a cycle of size at most
$2\log_2(2N/3 + O(1))$ and of weight at most $2(1+2w)\log_2(2N/3 + O(1))$. This yields a codeword of
weight at most $2(1+2w)\log_2(2N/3 + O(1))$ in the turbo code by Proposition 1. ■

3 The factor 2 comes from the fact that there are two convolutional codes, each one coming with its own set of clusters.

III. On the usefulness of designing sparse graph codes with degree one nodes 

It is worthwhile to quote [24] here: "Given the importance of degree-two edges, it is natural to conjecture
that degree-one edges could bring further benefits". This statement can be illustrated by the observation
that, on the one hand, turbo codes in general require fewer decoding iterations than LDPC codes and tend to
outperform LDPC codes at short blocklengths, and, on the other hand, they are decoded with a graphical
structure having bits of degree 1, which are absent in the case of LDPC codes. Another confirming example is given
by [22, Table VIII], where a small fraction of bits of degree 1, present in an LDPC ensemble, yields
a much steeper waterfall region than in the case of conventional LDPC codes.

Obtaining codes with a steep waterfall region and a moderate number of decoding iterations becomes
problematic in the case of low code rates: the number of iterations needed to converge increases when
$R$ decreases (and may go up to several hundreds), and the error rate curves become very flat. The main
purpose of this section is to investigate these two phenomena (number of decoding iterations and steepness
of performance curves) and to present a heuristic explanation for them, which gives us an insight
into the design of efficient low-rate codes. Getting a bit ahead, let us mention that, in order to design a good
low-rate ensemble, one should include bits of degree 1 in the code structure.$^4$

The explanation is given with the help of an EXIT chart on the binary erasure channel$^5$ (BEC). For the
BEC, the EXIT chart predicts accurately the infinite-length behavior of the code ensemble, and represents
the "average" trajectory for finite blocklengths, which is given by horizontal and vertical steps between
two EXIT curves, the curve of variable nodes and the curve of the base code. Iterative decoding is typically$^6$
successful if and only if the curve of variable nodes is above the curve of the base code. The area $\Delta A$
between both curves has a very nice interpretation: it is linked with the distance to capacity. It was
observed in [?] (a generalization of the result first proved in [25]) that, in order to get a capacity-achieving
sequence of codes in the sense of [25], the quantity $\Delta A$ in the sequence should go to 0.

4 Or "hidden" bits of degree 1 in the case of LDPC codes decoded in a turbo-like manner, namely LDPC codes for which
all parity checks involve at least two bits of degree 2.

5 Note that the same kind of explanation can also be given for other channels by asserting that the fundamental relation,
namely Theorem 1, which holds for the binary erasure channel, holds approximately for other channels.

To present our explanation, let us define the EXIT curves.

1) The EXIT curve of the variable nodes: When $\lambda_1 = 0$ and the channel erasure probability is $p$, this
curve is given by the set of points $(p\lambda(x), x)$ where $x$ ranges over $[0, 1]$. When $\lambda_1 > 0$, this curve
is given by the set of points $\left\{\left(\frac{p(\lambda(x) - \lambda_1)}{1 - \lambda_1}, x\right), x \in [0, 1]\right\}$.
If we introduce the degree distribution of the edges of degree $> 1$,

$$\tilde\lambda_i \stackrel{\text{def}}{=} \frac{\lambda_i}{1 - \lambda_1} \qquad (1)$$

for $i > 1$ (and $\tilde\lambda_1 = 0$), and the associated polynomial

$$\tilde\lambda(x) = \sum_{i>1} \tilde\lambda_i x^{i-1} = \sum_{i>1} \frac{\lambda_i}{1 - \lambda_1}\, x^{i-1}, \qquad (2)$$

then the EXIT curve of variable nodes is given by $\{(p\tilde\lambda(x), x), x \in [0, 1]\}$.

2) The EXIT curve of the base code: This is the curve which relates the fraction of erased messages entering
the base code to the fraction of erased messages leaving it after the base code decoding,
under the assumption of infinite base code length. In some cases this EXIT curve can be described
analytically. For example, for a right-regular LDPC code with check nodes of degree $r$, this curve is given by
the set of points $\{(x, 1 - (1 - x)^{r-1}), x \in [0, 1]\}$.
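The two curves just defined can be compared numerically. The sketch below covers the LDPC case without degree-1 bits, where the variable-node curve is the set of points $(p\lambda(x), x)$ and the base-code curve is $u \mapsto 1 - \rho(1 - u)$, with $\rho$ the edge-perspective check degree distribution. It estimates the largest erasure probability for which the first curve stays above the second. The function names, the sampling grid and the bisection are choices made for this example, and the (3,6)-regular ensemble is a standard textbook input rather than an ensemble from this paper.

```python
def poly(coeffs, x):
    """Evaluate sum_i coeffs[i] * x**(i - 1) for a dict {degree i: coeff}."""
    return sum(c * x ** (i - 1) for i, c in coeffs.items())


def converges(lam, rho, p, grid=2000):
    """True if the variable-node curve lies strictly above the base-code
    curve, i.e. 1 - rho(1 - p * lam(x)) < x for all sampled x in (0, 1]."""
    for k in range(1, grid + 1):
        x = k / grid
        if 1 - poly(rho, 1 - p * poly(lam, x)) >= x:
            return False
    return True


def bec_threshold(lam, rho, iters=40):
    """Bisection on the channel erasure probability p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if converges(lam, rho, mid):
            lo = mid
        else:
            hi = mid
    return lo


# (3,6)-regular LDPC ensemble: lam(x) = x^2, rho(x) = x^5.
threshold = bec_threshold({3: 1.0}, {6: 1.0})
print(round(threshold, 4))
```

The estimate is close to 0.4294, the known BEC threshold of the (3,6)-regular ensemble. The grid makes the check approximate, so the last digits should not be trusted.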

For the infinite-length case, the iterative decoding converges if and only if the base code EXIT curve lies
below the EXIT curve of the variable nodes. The statement we are going to give below is not really stated
in [?], but is in essence only a corollary of the results given in that paper.

6 I.e. successful with probability tending to 1 as $n \to \infty$.

Theorem 1 [Area theorem]: Let $\Delta A$ be the area between the two EXIT curves. Then

$$\Delta A = \frac{C(p) - R}{\bar\lambda\,(1 - \lambda_1)},$$

where $C(p)$ is the capacity of the BEC with erasure probability $p$, $C(p) = 1 - p$.
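The area relation can be verified numerically in the simplest case, an LDPC ensemble without degree-1 bits, where it reduces to $(1 - p - R)/\bar\lambda$. The sketch below integrates the two EXIT curves with a midpoint rule. The function names and the (3,6)-regular test ensemble are assumptions made for this illustration, and the computed area is a signed one, so it is meaningful as a gap only when the curves do not cross.

```python
def poly(coeffs, x):
    """Evaluate sum_i coeffs[i] * x**(i - 1) for a dict {degree i: coeff}."""
    return sum(c * x ** (i - 1) for i, c in coeffs.items())


def integral(f, steps=20000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))


def exit_area_ldpc(lam, rho, p):
    """Signed area between the variable-node curve {(p * lam(x), x)} and the
    base-code curve u -> 1 - rho(1 - u) in the unit square (lambda_1 = 0)."""
    left_of_variable_curve = p * integral(lambda x: poly(lam, x))
    under_base_curve = integral(lambda u: 1 - poly(rho, 1 - u))
    return 1 - left_of_variable_curve - under_base_curve


# (3,6)-regular LDPC ensemble: rate 1/2, average left degree 3.
p = 0.4
area = exit_area_ldpc({3: 1.0}, {6: 1.0}, p)
predicted = (1 - p - 0.5) / 3
print(area, predicted)
```

Both numbers come out close to 0.0333, as the area relation predicts for this ensemble at $p = 0.4$.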

The proof of the theorem is given in the Appendix. This result calls for several comments:

• For the same gap to capacity and fixed $\bar\lambda$, the area between the EXIT curves of variable nodes and
of the base code is larger in the presence of degree-1 nodes than without them by a factor of $\frac{1}{1 - \lambda_1}$.
This can be quite significant if $\lambda_1$ is large.

• Although the number of iterations does not necessarily decrease as $\Delta A$ grows (because it also depends
on the shapes of both EXIT curves), in many cases it does. As the presence of degree-1 nodes makes
the two EXIT curves lie farther from each other, it helps to decrease the number of iterations. Note that
turbo codes, especially low-rate turbo codes, have a large $\lambda_1$, which may explain the small number
of iterations needed for their convergence in comparison to LDPC codes, decoded by the standard
Gallager algorithm and not having degree-1 nodes at all.

• Increasing $\Delta A$ also has a positive influence on the slope in the waterfall region, as was put
forward in [15], [16], [17]. This might be the explanation why turbo codes are believed to outperform
LDPC codes for moderate lengths. In this case, it is essential to have a steep waterfall region.$^7$

A straightforward corollary of Theorem 1 is

Corollary 3:

$$\frac{d\,\Delta A}{dp} = -\frac{1}{\bar\lambda\,(1 - \lambda_1)}.$$

To illustrate this point, let us consider an example of a particular TLDPC code family, which will
be defined in Section IV. It has the rate $R = \frac{1}{10}$ and the fraction $\lambda_1 =$ [?]. This code family is almost
capacity-achieving for the BEC, where it corrects up to 89.6% of channel erasures: for $p_0 = 0.896$, the two
EXIT curves, drawn as solid lines in Fig. 1, touch each other. Fig. 1 also presents the EXIT curves
(dashed lines) obtained for $p = p_0 - 0.07$. One can see that they lie much farther apart, as predicted.
Now, to estimate qualitatively how fast the two EXIT curves move apart, let us compare them with the
EXIT curves of an LDPC code ensemble of rate $\frac{1}{10}$. For this, we choose an LDPC ensemble with check
nodes of degrees 2, 3 and 4, the edge connections to which are described by the check degree distribution
$\rho(x) = [?]\,x + [?]\,x^2 + [?]\,x^3$ (see [24] for the definition of $\rho(x)$). Such a choice of $\rho(x)$ makes the shapes
of the EXIT curves for the TLDPC base code and for the LDPC base code similar to each other, which
allows a fair comparison. To design an LDPC code with parameters similar to those of the
TLDPC code, i.e. of rate close to $\frac{1}{10}$ and with maximum variable node degree 12, we choose
$\lambda(x) = 0.486x + 0.165x^2 + 0.037x^3 + 0.15x^4 + 0.132x^{10} + 0.03x^{11}$. The ensemble has the rate $R \approx \frac{1}{10}$
and the threshold $p_0 \approx 0.8933$. Fig. 2 shows its EXIT curves at the threshold $p_0$ and for $p = p_0 - 0.07$.
At $p = p_0 - 0.07$, the EXIT curves of the base code and of variable nodes are much closer than they are
in the TLDPC case, as the EXIT curve of the base code does not change with $p$: it is always given by
the function $x \mapsto 1 - \rho(1 - x)$.

7 We refer the reader to [?] for a rigorous derivation of the exponential behavior of the error probability, suggested in
the aforementioned references, shown for LDPC codes over the BEC. The generalization of the result to turbo-like ensembles
is given in [3]. For a generalization of the formulas from (2) to more general channels, see [14], [13].

The situation becomes different in the presence of bits of degree 1, as in the TLDPC case. When the
channel improves, the EXIT curve of the base code moves below its initial position, obtained at the
threshold $p_0$. The gain in the area is quantified by Proposition 2, and the area $\Delta A_1$ between the EXIT
curves of the base code at $p$ and at $p - \Delta p$ is given by

$$\Delta A_1 = \frac{\lambda_1}{1 - \lambda_1}\,\Delta p.$$

As an example, Fig. 3 shows the area for the given TLDPC code of rate $\frac{1}{10}$. This really accounts for the
difference between the TLDPC case and the LDPC case and clearly results in a smaller number of
decoding iterations needed to converge. Moreover, the fact that the EXIT curves of the base code and of
variable nodes lie farther apart is very likely to improve the slope in the waterfall region.

Although the formula given in Proposition 1 seems to depend on $\lambda_1$ too, this quantity has no influence
at all on how fast the variable node curve moves away as $p$ decreases. Indeed, the area $\Delta A_2$
between the EXIT curves of variable nodes at $p$ and at $p - \Delta p$ is given by

$$\Delta A_2 = \frac{\frac{1}{\bar\lambda} - \lambda_1}{1 - \lambda_1}\,\Delta p = \frac{\Delta p}{\bar{\tilde\lambda}},$$

where $\bar{\tilde\lambda} \stackrel{\text{def}}{=} \frac{1}{\sum_{i>1} \tilde\lambda_i / i}$ and the $\tilde\lambda_i$'s form the degree
distribution of variable nodes of degrees $> 1$, as defined
by (1). This is a consequence of the fact that the EXIT curve of variable nodes actually depends on
$\tilde\lambda(x)$ (see (2)) and not on $\lambda(x)$. Such a dependency on $\bar{\tilde\lambda}$ seems to suggest that, in order to improve the
performance, one should try to design sparse-graph codes with $\bar{\tilde\lambda}$ as small as possible. Ideally, one should
get $\bar{\tilde\lambda} = 2$, which, by the way, is precisely the case for parallel turbo codes. This consideration provides a
heuristic explanation for the common belief that sparse-graph codes with a small $\bar{\tilde\lambda}$ give a good iterative
decoding behavior for small and moderate lengths (i.e. the slope of the waterfall region).
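As a consistency check, the two gains in area, $\Delta A_1 = \frac{\lambda_1}{1-\lambda_1}\Delta p$ for the base code curve and $\Delta A_2$ for the variable node curve, should add up to the total change of area predicted by the area theorem when $p$ decreases by $\Delta p$. The short derivation below is a sketch that assumes only relation (1) and the definition of the average left degree, $\bar\lambda = 1/\sum_i \lambda_i / i$; $\bar{\tilde\lambda}$ denotes the average degree of the distribution renormalized over the degrees $> 1$.

```latex
\begin{align*}
\frac{1}{\bar{\tilde\lambda}}
  &= \sum_{i>1} \frac{\tilde\lambda_i}{i}
   = \frac{1}{1-\lambda_1} \sum_{i>1} \frac{\lambda_i}{i}
   = \frac{1}{1-\lambda_1} \left( \frac{1}{\bar\lambda} - \lambda_1 \right), \\
\Delta A_1 + \Delta A_2
  &= \frac{\lambda_1}{1-\lambda_1}\,\Delta p
   + \frac{\frac{1}{\bar\lambda} - \lambda_1}{1-\lambda_1}\,\Delta p
   = \frac{\Delta p}{\bar\lambda\,(1-\lambda_1)} .
\end{align*}
```

The last expression is exactly the change of $\frac{C(p) - R}{\bar\lambda(1 - \lambda_1)}$ when $p$ decreases by $\Delta p$, with $C(p) = 1 - p$.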

Also note that the case $\bar{\tilde\lambda} = 2$ corresponds to $\tilde\lambda(x) = x$ and, therefore, the EXIT curve of variable
nodes is then the straight line $x = py$. Hence, an almost capacity-achieving ensemble in this case should
be designed on a base code, the EXIT curve of which is close to $x = py$. We succeeded in obtaining this
behavior for base code curves of the TLDPC code family, defined in the following section.

IV. TLDPC Ensemble of Rate 1/10 Satisfying the Necessary Condition on $d_{\min}$

TLDPC codes are a structured code family, first proposed in [4] to meet the requirements of a low
iterative decoding complexity, of linear $d_{\min}$ and of an iterative threshold close to the channel capacity. They
can be viewed as a slight modification of LDPC codes which allows degree-1 variable nodes by
adding some state nodes to the graph structure. They differ from the multi-edge approach suggested in
[22] in two points: (i) the TLDPC base code is not a juxtaposition of single parity-check codes but
a tail-biting convolutional code with binary state nodes, (ii) its structure permits a one-dimensional
optimization of $\tilde\lambda(x)$ (2), and not a multi-dimensional optimization as is the case for multi-edge LDPC codes.
They have been designed by using several construction methods combined together; some of the methods
apply to the base code, and some concern the bipartite graph.

A. Definition of TLDPC Codes

1) Definition: For the moment, suppose that $\lambda_1 = 0$. Then the TLDPC base code is defined as follows:

Definition 7 (TLDPC base code): The base code $B$ of the TLDPC code is a tail-biting convolutional
code, the Tanner graph of which is presented in Fig. 4. The •'s are associated with positions of the base
code, white vertices with non-transmitted states, and the ⊕'s represent parity-check equations. The first
and the last state nodes are identified. The number of •'s associated with the $i$-th ⊕ is denoted by $b_i$.

In the presence of degree-1 bits ($\lambda_1 > 0$), the TLDPC base code is defined in a similar manner, yet the
positions of degree 1 in $B$ have to be specified. It is worth mentioning that systematic RA (Repeat-Accumulate)
codes, systematic IRA (Irregular Repeat-Accumulate) codes and most of the LDPC codes which
are standardized$^8$ are in fact a subclass of TLDPC codes, once they are decoded as a turbo code and not
as an LDPC code. All these codes have particular TLDPC base codes (see Fig. 5), for which all $b_i$'s are
equal to 1 for even values of $i$ and where the corresponding variable nodes are all chosen to be of degree
1. The positions of degree 1 are redundancy bits of the code.

The important feature of the EXIT curve for the defined TLDPC base code is that it is close to a
straight line (see the previous section for the discussion of this point). Moreover, the base code is not more complex
to decode than single parity-check codes, and is clearly much easier to decode than the convolutional codes
in the underlying structure of turbo codes. The TLDPC base code also allows a larger $\lambda_2$ under
the condition of linear $d_{\min}$, when compared with conventional LDPC codes, which is helpful for the speed
of iterative decoding convergence and for the waterfall region.

In order to design code ensembles with linear $d_{\min}$, one more constraint is to be put on the choice of the
base code B, to satisfy the necessary condition given in Section II-D: the clusters, formed by codewords
of partial weight 2 in the designed base code, must have bounded weights. This condition ensures that
the graph G has a linear number of clusters and, hence, that a non-zero fraction $\lambda_2$ may be allowed with the
condition on linear $d_{\min}$ growth still satisfied. For example, the condition is not satisfied by
systematic IRA codes, which have only a single cluster.

Footnote 8: i.e., those LDPC codes which have as many degree-2 variable nodes as parity checks and in which these
parity-check nodes are connected together by a single chain of degree-2 variable nodes.

2) Structure of the bipartite graph: A constraint on the permutation of edges connected to degree-2
variable nodes in the bipartite graph comes from the necessary condition on linear $d_{\min}$. The permutation
for edges connected to other variable nodes is generated randomly.

The design of the code ensemble starts with the choice of the base code. Then the variable-node degree
distribution is optimized, by fitting the EXIT curves of the variable nodes and of the
base code, for a target code rate. As before, let the degree distribution, renormalized over the degrees
> 1, be denoted by $\tilde{\lambda}(x)$. Let $d_{cluster}$ be the average degree of clusters. Then, during the optimization,
the renormalized fraction $\tilde{\lambda}_2$ of edges connected to degree-2 variable nodes is required to be smaller than
$2/d_{cluster}$, so that the average degree of G is smaller than 2. Suppose that $\tilde{\lambda}_2 < 2/d_{cluster}$. At this point
some structure on G has to be chosen so that G does not contain cycles. The simplest way
would seem to be to make G a union of disjoint paths. In this case, however, the prediction of the iterative
threshold given by EXIT curve fitting is not accurate, for the following reason: the EXIT
method implicitly assumes that the positions of degree 2 in B are chosen independently of each other
with probability $\tilde{\lambda}_2$. So the expected fraction of clusters of degree i in G should be
$\binom{s}{i} \tilde{\lambda}_2^i (1 - \tilde{\lambda}_2)^{s-i}$ if all clusters are of size s. To keep the prediction of the EXIT method
accurate, the degree-2 variable nodes are to be chosen such that the fraction of clusters of degree i is equal to
this expected fraction. Their positions must also be chosen so as to avoid cycles of sublinear length in G.
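
The bound on the renormalized fraction of degree-2 edges can be motivated by a short counting argument. The following is only a sketch under assumptions that the text does not spell out: $M$ (number of clusters, i.e. vertices of G) and $E$ (number of edges of G) are illustrative symbols, every cluster is taken to contain $d_{cluster}$ positions of degree > 1, and each degree-2 variable node is taken to contribute exactly one edge of G.

```latex
% Counting sketch; M and E are illustrative symbols, not notation from the paper.
\begin{align*}
E &= \tfrac{1}{2}\,\tilde{\lambda}_2\, d_{cluster}\, M
  && \text{(each edge of $G$ uses two degree-2 positions)} \\
\frac{2E}{M} &= \tilde{\lambda}_2\, d_{cluster}
  && \text{(average degree of $G$)} \\
E \le M - 1 \;&\Longrightarrow\; \tilde{\lambda}_2\, d_{cluster} < 2
  && \text{(an acyclic graph on $M$ vertices has at most $M-1$ edges)}
\end{align*}
```

Under these assumptions the bound is a necessary requirement for G to be cycle-free, not a sufficient one, which is why a structure on G still has to be chosen explicitly. For instance, clusters of degree 4 would require $\tilde{\lambda}_2 < 1/2$.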

B. Design of a Low-Rate Ensemble 

The design criteria proposed above were previously used in the design of TLDPC codes of rates 1/3
and 1/2 in flU, 0, and gave very good results. The obtained iterative thresholds are within 0.2 to 0.5
dB of the Gaussian channel capacity. Moreover, it has been proved that one of the code ensembles
has $d_{\min}$ growing linearly in the blocklength. In this paper, we design a TLDPC ensemble of rate 1/10,
following the same construction methods. For our ensemble, it is possible to allow a large non-zero fraction
$\lambda_1$ and still satisfy the necessary condition on linear $d_{\min}$. In what follows, a low-rate TLDPC base
code and a permutation structure for degree-2 variable nodes are suggested.

1) TLDPC base code of rate 1/2: With the aim of designing codes of low rates, we propose a TLDPC
base code of rate 1/2, defined by the Tanner graph shown in Fig. 6. Note that here $b_i = 1$ for any i. Every
third section of the base code is chosen to be of degree 1, i.e. this position is connected to a degree-1
variable node in the bipartite graph. Positions of degree 1 are marked in blue in the figure. All other
positions have degrees > 1. Such a base code gives rise to a code ensemble with $\lambda_1 = 1/3$.

As for the clusters in the graph G, they correspond to the pattern in the Tanner graph of the base code
represented in Fig. 7: any two positions of degree > 1 in it give rise to a codeword of partial weight 2.
The cluster degree equals 4, and G contains as many clusters as there are such subgraphs in the Tanner
graph of the base code. To satisfy the necessary condition on linear $d_{\min}$, $\tilde{\lambda}_2$ should satisfy
$\tilde{\lambda}_2 < 1/2$.

2) Degree optimization over the Gaussian channel and permutation structure for rate 1/10: Let us fix
the design code rate equal to 1/10. We choose $\tilde{\lambda}_2$ to be slightly less than 1/2, namely $\tilde{\lambda}_2 = 0.4$, in order to
simplify the structure of G. First, let us compute the cluster degree distribution
$A = (\alpha_0, \alpha_1, \alpha_2, \alpha_3, \alpha_4)$, where $\alpha_i$ represents the fraction of clusters of degree i in G. If the degrees
of the clusters in G are chosen at random given $\tilde{\lambda}_2 = 0.4$, the expected values of the $\alpha_i$'s are:

$$\alpha_0 = \frac{81}{625}; \quad \alpha_1 = \frac{216}{625}; \quad \alpha_2 = \frac{216}{625}; \quad \alpha_3 = \frac{96}{625}; \quad \alpha_4 = \frac{16}{625}.$$

We choose the $\alpha_i$'s to be equal to these fractions for the reasons explained before.
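
These fractions can be checked against the binomial law for clusters of size 4, with each position being of degree 2 independently with probability 0.4. The snippet below is a minimal sketch of that check; the function name `cluster_degree_fractions` is illustrative and does not come from the paper.

```python
from fractions import Fraction
from math import comb


def cluster_degree_fractions(size, lam2):
    """Expected fraction of clusters with i degree-2 positions, for i = 0..size.

    Assumes each of the `size` positions of a cluster is of degree 2
    independently with probability `lam2` (binomial law).
    """
    return [comb(size, i) * lam2**i * (1 - lam2) ** (size - i)
            for i in range(size + 1)]


# Clusters of size 4 and a renormalized degree-2 fraction of 0.4 = 2/5.
alphas = cluster_degree_fractions(4, Fraction(2, 5))
for i, alpha in enumerate(alphas):
    print(f"alpha_{i} = {alpha}")
```

Exact rational arithmetic is used so that the result can be compared directly with the fractions over 625 given above.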

Let us find a structure of G with this degree distribution, so that G does not contain cycles. We choose it
to contain the following components, which we call "stars", "twigs" and "chains" (see Fig. 8). Namely, we
divide the Tanner graph of the base code into subgraphs similar to the one represented in Fig. 7 and associate
a cluster to each of them. We assume that the number of clusters M is divisible by 625. The generation
of the bipartite graph is then performed by associating clusters in order to form the aforementioned
components. It is straightforward to check that this is indeed possible. We summarize in Table I the
fraction of clusters consumed by each component. Note that, in Table I, the entry for a given component c
and a given cluster degree i is the fraction of clusters of degree i consumed in component c.
Using the table, the following three points are easy to check: 1) All clusters are consumed
in the components, because the sum of the entries of the column corresponding to any degree i gives $\alpha_i$.
2) Each entry is nonnegative. 3) The "chains" are possible to form, as the number of clusters of degree
1 used to form "chains" is even.

After the degree optimization for the Gaussian channel, the following degree distribution was obtained:
$\tilde{\lambda}(x) = 0.4x + 0.264209x^2 + 0.090866x^4 + 0.236716x^8 + 0.008209x^9$.
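
As a sanity check, the design rate implied by this degree distribution can be recomputed by counting variable nodes and parity checks. The sketch below rests on assumptions not stated explicitly in the text: the usual edge-perspective convention (the coefficient of $x^{i-1}$ is the fraction of edges attached to degree-i nodes), a base code of rate 1/2 with a fraction 1/3 of degree-1 positions, and a design rate equal to one minus the ratio of base-code parity checks to variable nodes. The function name `design_rate` is illustrative.

```python
from fractions import Fraction


def design_rate(base_rate, lam1, lam_tilde):
    """Design rate of the ensemble (counting sketch, not the paper's formula).

    base_rate : rate of the base code
    lam1      : fraction of base-code positions of degree 1
    lam_tilde : {variable-node degree: edge fraction}, renormalized over degrees > 1
    """
    # Variable nodes per base-code position among the positions of degree > 1.
    inv_degree = sum(frac / deg for deg, frac in lam_tilde.items())
    # Variable nodes per base-code position overall (n / m).
    nodes_per_position = lam1 + (1 - lam1) * inv_degree
    # The base code imposes (1 - base_rate) parity checks per position.
    return 1 - (1 - base_rate) / nodes_per_position


# Exponent i - 1 of x corresponds to variable-node degree i.
lam_tilde = {
    2: Fraction("0.4"),
    3: Fraction("0.264209"),
    5: Fraction("0.090866"),
    9: Fraction("0.236716"),
    10: Fraction("0.008209"),
}

rate = design_rate(Fraction(1, 2), Fraction(1, 3), lam_tilde)
print(float(rate))  # close to 0.1
```

Under these assumptions the coefficients sum to 1 and the computed rate is very close to the target of 1/10.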

V. Numerical Results 
Let us present the performance of TLDPC codes of rate 1/10 and of lengths 6250, 18750, 50000 and 62500
over the Gaussian channel. In each of these cases, $\tilde{\lambda}(x)$ of Section IV-B2 was adapted to the given blocklength.
The corresponding word and bit error rates, obtained by simulation, are given in Fig. 9. The maximum
number of iterations was fixed to 200. It can be seen in the figure that the estimated decoding threshold is
about -0.8 dB, which matches the value obtained with the EXIT method. Notice that the threshold
is only 0.5 dB away from the channel capacity, equal to -1.286 dB. This is quite close for these
signal-to-noise ratios, since the capacity at -0.8 dB is only about 0.111. In addition, the simulations did not
reveal an error floor, as expected given the good $d_{\min}$ of the designed codes.

As for the convergence, for the largest simulated blocklength (62500) and at a signal-to-noise ratio of -0.5
dB, the decoder needs only 86 iterations on average to converge, owing to the large fraction $\lambda_1$. Moreover,
as the base code can be represented by a 2-state trellis in which each trellis section carries only one bit, the
complexity of one decoding iteration is very low. This results in a low total decoding complexity.

VI. Discussion 

In this paper, two objectives were pursued. The first was to define a necessary condition for designing
sparse-graph codes with minimum distance linear in the blocklength. Such a condition has been found and
is expressed either in terms of cycles or in terms of the average degree of the graph of codewords of partial
weight 2. The second objective was to design a new low-rate, structured code ensemble with features such
as a linear $d_{\min}$, a small gap to the channel capacity, a low decoding complexity and the possibility of
applying well-developed techniques (EXIT charts, density evolution) to optimize the degree distribution of
variable nodes. This design has been carried out in the framework of TLDPC codes, and
a TLDPC code ensemble of rate 1/10 performing well over the Gaussian channel has been proposed.

The linear minimum distance property of the presented TLDPC ensemble may be proved by using
standard techniques based on weight distributions, for instance by computing the growth rate of the
average weight distribution in the asymptotic case and showing that its first derivative at the origin is
strictly negative. We do not present such a proof in this paper, but we conjecture that the property holds.
Appendix

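First recall ([7]) that the area under the EXIT curve for the variable nodes is given by

Proposition 1: $A = 1 - p\,\dfrac{\frac{1}{\bar{\lambda}} - \lambda_1}{1 - \lambda_1}$.

Proof: The area below the curve of variable nodes is

$$\int_0^1 \left(1 - p\,\frac{\lambda(x) - \lambda_1}{1 - \lambda_1}\right) dx = 1 - p\,\frac{\int_0^1 \lambda(x)\,dx - \lambda_1}{1 - \lambda_1} = 1 - p\,\frac{\frac{1}{\bar{\lambda}} - \lambda_1}{1 - \lambda_1},$$

using $\int_0^1 \lambda(x)\,dx = 1/\bar{\lambda}$, with $\bar{\lambda}$ the average degree of the variable nodes.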

The area below the EXIT chart of the base code is given by a corollary of [2, Theorem 1]:

Proposition 2: Assume that the bits of degree 1 of the base code B can be completed to form an
information set for B. Then the area A under the EXIT curve of the base code over the BEC is given by
$A = \dfrac{R_b - (1-p)\lambda_1}{1 - \lambda_1}$, where $R_b$ denotes the rate of the base code.

Proof: From Theorem 1 of [2] we know that $A = \dfrac{H(V \mid Y)}{(1-\lambda_1)\,m}$. Here V consists of a codeword of the
base code which is chosen uniformly at random, and Y is the transmitted codeword in which all positions of
degree > 1 have been erased and all positions of degree 1 have been erased with probability p. Let Z be
the number of non-erased positions of V. Note that $H(V \mid Z = t) = R_b m - t$, by the assumption made on
the positions of degree 1. So $H(V \mid Y) = R_b m - (1-p)\lambda_1 m$, and the proposition follows immediately. □

We are now ready for the proof of Theorem 1.

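Proof of Theorem 1: As long as the EXIT curve of the base code lies below the EXIT curve of the
variable nodes, by Propositions 2 and 1,

$$
\begin{aligned}
\Delta A &= 1 - p\,\frac{\frac{1}{\bar{\lambda}} - \lambda_1}{1 - \lambda_1} - \frac{R_b - (1-p)\lambda_1}{1 - \lambda_1} \\
&= \frac{\bar{\lambda}(1 - \lambda_1) - p(1 - \lambda_1\bar{\lambda}) - R_b\bar{\lambda} + (1-p)\lambda_1\bar{\lambda}}{\bar{\lambda}(1 - \lambda_1)} \\
&= \frac{(1 - R_b)\bar{\lambda} - p}{\bar{\lambda}(1 - \lambda_1)} = \frac{C(p) - R}{\bar{\lambda}(1 - \lambda_1)},
\end{aligned}
$$

where the last equality uses $C(p) = 1 - p$ over the BEC and $R = 1 - (1 - R_b)\bar{\lambda}$.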
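
The algebra of this last derivation can be checked numerically. The sketch below assumes the following reading of the formulas: the variable-node area is $1 - p\,(1/\bar{\lambda} - \lambda_1)/(1-\lambda_1)$, the base-code area is $(R_b - (1-p)\lambda_1)/(1-\lambda_1)$, $C(p) = 1 - p$, and the design rate is $R = 1 - (1-R_b)\bar{\lambda}$, with $\bar{\lambda}$ the average variable-node degree. The function name and the parameter values are illustrative only.

```python
from fractions import Fraction


def area_gap(p, base_rate, lam1, avg_degree):
    """Return (difference of the two areas, closed-form gap) over the BEC.

    p          : erasure probability of the channel
    base_rate  : rate R_b of the base code
    lam1       : fraction of degree-1 positions
    avg_degree : average variable-node degree (assumed meaning of lambda-bar)
    """
    area_var = 1 - p * (1 / avg_degree - lam1) / (1 - lam1)
    area_base = (base_rate - (1 - p) * lam1) / (1 - lam1)
    rate = 1 - (1 - base_rate) * avg_degree   # design rate R
    capacity = 1 - p                          # BEC capacity C(p)
    closed_form = (capacity - rate) / (avg_degree * (1 - lam1))
    return area_var - area_base, closed_form


# Illustrative parameters: R_b = 1/2, lambda_1 = 1/3, average degree 9/5
# (these give a design rate of exactly 1/10).
for p in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    gap, closed = area_gap(p, Fraction(1, 2), Fraction(1, 3), Fraction(9, 5))
    print(p, gap, closed)
```

With exact rational arithmetic the two returned values coincide for every p, which is what the chain of equalities asserts under the stated reading.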
Fig. 3. $\Delta A$ for the TLDPC code of rate ^




Fig. 4. Tanner graph of a TLDPC base code. 
Fig. 5. Base code for systematic (I)RA codes and standardized LDPC codes.
Fig. 6. Tanner graph of a TLDPC base code of rate 1/2.
Fig. 7. Pattern in the Tanner graph of the TLDPC base code of rate 1/2 giving rise to a cluster.
Fig. 8. Configurations in the structure of the graph of codewords of partial weight 2. Clusters of different degrees have a different color.
Fig. 9. Performance of TLDPC codes of lengths (from right to left) 6250, 18750, 50000 and 62500 with $\lambda_1 = 1/3$ and the degree
distribution of Section IV-B2. Solid lines represent word error rates and dashed lines represent bit error rates.